For help with R Markdown for reports, see this white paper from Carnegie Mellon University’s Department of Statistics and Data Science.
For each of the seven statistical distributions we covered in the last assignment (Normal, Student’s \(t\), \(\chi ^ 2\), \(F\), Binomial, Negative Binomial, and Poisson),
i <- rnorm(10000, 2, sqrt(5))
ii <- rt(10000, 4, 0)
iii <- rchisq(10000, 2, 0)
iv <- rf(10000, 90, 12, 0)
v <- rbinom(10000, 9, 2/3)
vi <- rnbinom(10000, 5, 0.5)
vii <- rpois(10000, 3)
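The draws above are not seeded, so the printed summaries change every time the report is knit. A minimal sketch of a reproducible version follows. The seed value and the list name `samples` are assumptions, not part of the original solution, and the numbers it produces will differ from the output shown in this report. The explicit `ncp = 0` arguments are dropped, which leaves the distributions unchanged. Note that `rnorm()` takes the standard deviation, so `sqrt(5)` corresponds to a variance of 5.

```r
# Hypothetical seed; any fixed integer makes the draws repeatable.
set.seed(1)
n <- 10000

# Same distributions and parameters as above, with arguments named.
samples <- list(
  normal   = rnorm(n, mean = 2, sd = sqrt(5)),
  t        = rt(n, df = 4),
  chisq    = rchisq(n, df = 2),
  f        = rf(n, df1 = 90, df2 = 12),
  binomial = rbinom(n, size = 9, prob = 2/3),
  nbinom   = rnbinom(n, size = 5, prob = 0.5),
  poisson  = rpois(n, lambda = 3)
)

# Sample means, for comparison with the theoretical means.
sapply(samples, mean)
```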
summary(i[1:6])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.7467 1.6534 2.2234 2.9999 4.6863 5.8350
summary(ii[1:6])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -0.59546 -0.08433 0.50154 1.12610 1.97938 4.15662
summary(iii[1:6])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.1071 1.0867 1.5758 2.3137 3.1641 6.0063
summary(iv[1:6])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.6320 0.8816 0.9870 1.2937 1.4852 2.6583
summary(v[1:6])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 4.00 5.00 5.00 6.00 7.25 9.00
summary(vi[1:6])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 3.000 4.250 5.500 5.333 6.750 7.000
summary(vii[1:6])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.000 1.250 2.000 1.833 2.000 3.000
ii. plot the histogram of the subset, and
hist(i[1:6])
hist(ii[1:6])
hist(iii[1:6])
hist(iv[1:6])
hist(v[1:6])
hist(vi[1:6])
hist(vii[1:6])
iii. plot the estimated density of this subset.
plot(density(i[1:6]))
plot(density(ii[1:6]))
plot(density(iii[1:6]))
plot(density(iv[1:6]))
plot(density(v[1:6]))
plot(density(vi[1:6]))
plot(density(vii[1:6]))
3. Repeat Item 2 for the first \(N = 10,\ 20,\ 30,\ \text{and}\ 50\) values from the random vector you generated in Item 1. Remark on the changing behaviour as the sample size increases.
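Items 2 and 3 apply the same three steps (summary, histogram, density plot) at every sample size, so the repetition can be sketched as a loop. The helper name `describe_first`, the seed, and the stand-in vector `x` are hypothetical and only illustrate the idea for one distribution.

```r
# Hypothetical helper: summarise and plot the first n values of x.
describe_first <- function(x, n, label = "x") {
  sub <- x[seq_len(n)]
  print(summary(sub))
  hist(sub, main = paste("Histogram of", label, "n =", n))
  plot(density(sub), main = paste("Density of", label, "n =", n))
  invisible(sub)   # return the subset without printing it
}

set.seed(1)                      # assumed seed, for repeatability only
x <- rnorm(10000, 2, sqrt(5))    # stand-in for the vector i above

for (n in c(6, 10, 20, 30, 50)) {
  describe_first(x, n, label = "Normal")
}
```

Wrapping the steps in a function keeps the subset size in one place, which avoids mismatched indices between the summary and the plots.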
summary(i[1:10])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -0.8288 0.8109 1.8215 2.1669 2.8453 5.8350
summary(ii[1:10])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -1.24151 -0.21697 0.03333 0.63274 1.10529 4.15662
summary(iii[1:10])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.1071 0.9345 1.2519 1.8570 1.9656 6.0063
summary(iv[1:10])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.6320 0.9014 0.9984 1.1880 1.1658 2.6583
summary(v[1:10])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 4.00 5.00 6.50 6.40 7.75 9.00
summary(vi[1:10])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 2.0 4.0 5.5 5.5 7.0 10.0
summary(vii[1:10])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.00 1.25 2.00 2.30 2.75 5.00
hist(i[1:10])
hist(ii[1:10])
hist(iii[1:10])
hist(iv[1:10])
hist(v[1:10])
hist(vi[1:10])
hist(vii[1:10])
plot(density(i[1:10]))
plot(density(ii[1:10]))
plot(density(iii[1:10]))
plot(density(iv[1:10]))
plot(density(v[1:10]))
plot(density(vi[1:10]))
plot(density(vii[1:10]))
summary(i[1:20])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -2.3975 0.8964 2.2683 2.3167 3.2270 5.9957
summary(ii[1:20])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -2.62665 -0.62652 -0.10927 0.08884 0.75787 4.15662
summary(iii[1:20])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.1071 1.0204 1.1499 2.1036 2.2238 9.2422
summary(iv[1:20])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.4575 0.7868 0.9185 1.1802 1.3745 3.1243
summary(v[1:20])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 3.00 5.00 6.00 5.80 6.25 9.00
summary(vi[1:20])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.00 3.00 4.50 4.85 7.00 10.00
summary(vii[1:20])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.00 1.00 2.00 2.35 3.00 5.00
hist(i[1:20])
hist(ii[1:20])
hist(iii[1:20])
hist(iv[1:20])
hist(v[1:20])
hist(vi[1:20])
hist(vii[1:20])
plot(density(i[1:20]))
plot(density(ii[1:20]))
plot(density(iii[1:20]))
plot(density(iv[1:20]))
plot(density(v[1:20]))
plot(density(vi[1:20]))
plot(density(vii[1:20]))
summary(i[1:30])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -2.3975 0.4987 1.8351 1.8947 3.0926 5.9957
summary(ii[1:30])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -2.6267 -0.4963 -0.0600 0.1540 0.7785 4.1566
summary(iii[1:30])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0318 1.0347 1.9062 2.5207 3.4851 9.2422
summary(iv[1:30])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.4575 0.6367 0.9071 1.0927 1.1658 3.1243
summary(v[1:30])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 3.0 5.0 5.0 5.6 6.0 9.0
summary(vi[1:30])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0 3.0 5.0 4.9 7.0 10.0
summary(vii[1:30])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.000 1.000 2.000 2.633 3.000 10.000
hist(i[1:30])
hist(ii[1:30])
hist(iii[1:30])
hist(iv[1:30])
hist(v[1:30])
hist(vi[1:30])
hist(vii[1:30])
plot(density(i[1:30]))
plot(density(ii[1:30]))
plot(density(iii[1:30]))
plot(density(iv[1:30]))
plot(density(v[1:30]))
plot(density(vi[1:30]))
plot(density(vii[1:30]))
summary(i[1:50])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -2.3975 0.7493 1.8376 2.0769 3.5612 6.8499
summary(ii[1:50])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -2.6267 -0.6887 -0.0600 0.1486 0.9206 4.1566
summary(iii[1:50])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0318 1.0036 1.3980 2.3129 3.4851 9.2422
summary(iv[1:50])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.4289 0.7310 0.9779 1.1215 1.2746 3.1243
summary(v[1:50])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 3.00 5.00 6.00 5.84 7.00 9.00
summary(vi[1:50])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.00 2.00 4.00 4.46 7.00 14.00
summary(vii[1:50])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.00 2.00 3.00 2.94 4.00 10.00
hist(i[1:50])
hist(ii[1:50])
hist(iii[1:50])
hist(iv[1:50])
hist(v[1:50])
hist(vi[1:50])
hist(vii[1:50])
plot(density(i[1:50]))
plot(density(ii[1:50]))
plot(density(iii[1:50]))
plot(density(iv[1:50]))
plot(density(v[1:50]))
plot(density(vi[1:50]))
plot(density(vii[1:50]))
# We can infer from the density plots and histograms that the curves become smoother as the sample size increases.
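One way to make the smoothing remark concrete is to draw the estimated density next to the true density at a few sample sizes. The sketch below does this for the Normal case only. The seed, the vector `z`, and the error measure `errs` are assumptions introduced for illustration.

```r
set.seed(1)                                  # assumed seed
z <- rnorm(10000, mean = 2, sd = sqrt(5))    # stand-in for the vector i

sizes <- c(6, 50, 10000)
errs <- numeric(length(sizes))               # worst-case gap per sample size

op <- par(mfrow = c(1, length(sizes)))       # one panel per sample size
for (j in seq_along(sizes)) {
  d <- density(z[seq_len(sizes[j])])
  plot(d, main = paste("n =", sizes[j]))
  # Dashed line: the true N(2, 5) density.
  curve(dnorm(x, mean = 2, sd = sqrt(5)), add = TRUE, lty = 2)
  # Largest absolute difference between estimate and truth on the grid.
  errs[j] <- max(abs(d$y - dnorm(d$x, mean = 2, sd = sqrt(5))))
}
par(op)

errs
```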
summary(i)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -6.5984 0.5064 1.9752 1.9947 3.5057 10.6039
summary(ii)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -10.842620 -0.740901 -0.018082 -0.006291 0.718828 14.593535
summary(iii)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.000107 0.588600 1.414373 2.030298 2.825117 22.332680
summary(iv)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.2827 0.7869 1.0451 1.1987 1.4160 10.1421
summary(v)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.00 5.00 6.00 5.98 7.00 9.00
summary(vi)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.000 3.000 4.000 4.993 7.000 28.000
summary(vii)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.000 2.000 3.000 3.027 4.000 13.000
hist(i)
hist(ii)
hist(iii)
hist(iv)
hist(v)
hist(vi)
hist(vii)
plot(density(i))
plot(density(ii))
plot(density(iii))
plot(density(iv))
plot(density(v))
plot(density(vi))
plot(density(vii))
# For discrete distributions, a histogram is always a good choice. For continuous distributions, a histogram works well for small samples and a density plot works well for large samples.
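For the discrete case, `hist()` may group neighbouring integers into one bin depending on the breaks it picks, so a bar plot of relative frequencies is a common alternative. The sketch below compares the sample with the Poisson(3) probability mass function. The seed and the names `y`, `k`, and `freq` are assumptions.

```r
set.seed(1)                         # assumed seed
y <- rpois(10000, lambda = 3)       # stand-in for the vector vii above

# Relative frequency of every count from 0 to the largest observed value.
k <- 0:max(y)
freq <- as.vector(table(factor(y, levels = k))) / length(y)

# Bars show the sample; points show the Poisson(3) pmf.
mids <- barplot(freq, names.arg = k,
                xlab = "count", ylab = "relative frequency")
points(mids, dpois(k, lambda = 3), pch = 19)
```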
# Comparing the 5-number summaries at each sample size shows that, for the symmetric distributions, the median and mean get closer as the sample size increases. For the skewed distributions, the median and mean do not converge to each other: each settles near its own population value, so a gap remains (for example, a median of about 1.41 against a mean of about 2.03 for the full chi-squared sample).
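The comparison of mean and median can be tabulated directly. The sketch below tracks the difference between sample mean and sample median for one symmetric and one skewed distribution. The seed and the names `sym`, `skew`, `gap`, and `gaps` are assumptions, and the values will not match the summaries printed above.

```r
set.seed(1)                                   # assumed seed
sym  <- rnorm(10000, mean = 2, sd = sqrt(5))  # symmetric
skew <- rchisq(10000, df = 2)                 # right-skewed

sizes <- c(6, 10, 20, 30, 50, 10000)

# Mean minus median of the first n values.
gap <- function(x, n) mean(x[seq_len(n)]) - median(x[seq_len(n)])

gaps <- data.frame(
  n      = sizes,
  normal = sapply(sizes, function(n) gap(sym, n)),
  chisq  = sapply(sizes, function(n) gap(skew, n))
)
print(gaps)

# Population gap: 0 for the Normal; for chi-squared with 2 df it is
# the mean (2) minus the median.
2 - qchisq(0.5, df = 2)
```

For small n the gap is noisy in both columns. At n = 10000 the Normal column should sit near zero and the chi-squared column near the population gap.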